Abstract. Consider a sequence of random numbers $x_1, x_2, \ldots, x_n, \ldots$
The numbers are independent of each other, and all of them are drawn from the
same continuous distribution. If $x_{i-1} < x_i > x_{i+1}$, then $x_i$ is a
local maximum.

Here, we show that the probability mass function (PMF) $f_m(d)$ of distances
$d$ between local maxima is non-parametric and the same for any continuous
probability distribution of the random numbers in the sequence, and that the
average distance is exactly 3. We present a method of computing this PMF and
tabulate it for distances between 2 and 29. The PMF is confirmed to match the
distance distributions of sample random number sequences, which were created
by pseudo-random number generators or obtained from "true" random number
sources.



1. Average distance between local maxima

Let's take any number in the sequence and find the probability that it is a
local maximum.

Definition 1. A number $x_i$ is a local maximum if the following condition is true:

$$x_{i-1} < x_i > x_{i+1}.$$

First, we'll use a combinatorial approach. Consider the following sequence
(Figure 1) of pseudo-random numbers generated by the MS Excel RAND() function:

 (1)  0.935536495
 (2)  0.191531578
 (3)  0.429049655
 (4)  0.308968021
 (5)  0.179540986
 (6)  0.401329789
 (7)  0.71581906
 (8)  0.604617962
 (9)  0.877254876
(10)  0.973280207
(11)  0.489033299
(12)  0.912367351
(13)  0.604552972
(14)  0.039395302
(15)  0.3780448
(16)  0.55317569
(17)  0.6308772
(18)  0.373163479
(19)  0.812434426
(20)  0.560173882

Figure 1. Sample random sequence

Let's take any three consecutive numbers $x_{i-1}$, $x_i$ and $x_{i+1}$. For
instance, for $i = 3$ we have $x_2 = 0.191531578$, $x_3 = 0.429049655$ and
$x_4 = 0.308968021$. In this case, $x_3$ is a local maximum. If we denote the
greatest value as 2, the least value as 0 and the value in the middle as 1,
then we have a triplet (0,2,1). Any three consecutive numbers can be
represented by such a triplet. Because the numbers are independent and come
from the same continuous distribution, all six possible permutations

(0,1,2), (0,2,1), (1,0,2), (1,2,0), (2,0,1) and (2,1,0)

are equally likely. We are interested only in the two permutations that
represent local maxima:

(0,2,1) and (1,2,0).

Therefore, if we take any three consecutive numbers, the probability that the
number in the middle is a local maximum is

$$P_{max} = \frac{2}{6} = \frac{1}{3}.$$

If we have $3N$ numbers in the sequence, then on average $N$ of them are local
maxima. It also means that the average distance between maxima should be
exactly 3.
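The combinatorial argument above can be checked with a short sketch. It is only an illustration: the function names and the use of Python's `random` module as the number source are assumptions made here, not part of the original study. The first function enumerates the six orderings of a triplet; the second estimates the average distance between maxima in a simulated uniform sequence.

```python
from fractions import Fraction
from itertools import permutations
import random


def local_max_probability():
    """Fraction of orderings of three distinct values whose middle one is largest."""
    triplets = list(permutations(range(3)))
    hits = [t for t in triplets if t[0] < t[1] > t[2]]
    return Fraction(len(hits), len(triplets))


def mean_distance(n, seed=0):
    """Average gap between consecutive local maxima among n uniform random numbers."""
    rng = random.Random(seed)  # hypothetical choice of generator and seed
    xs = [rng.random() for _ in range(n)]
    peaks = [i for i in range(1, n - 1) if xs[i - 1] < xs[i] > xs[i + 1]]
    gaps = [b - a for a, b in zip(peaks, peaks[1:])]
    return sum(gaps) / len(gaps)


print(local_max_probability())  # 1/3
print(mean_distance(300_000))   # should be close to 3
```

The simulated average is a random quantity, so it only approaches 3 as the sequence grows.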

Now, let's introduce some additional notation and use an operator approach.
When $x_{i-1} < x_i$ we'll put an operator U, i.e. the sequence goes "up".
Alternatively, when $x_{i-1} > x_i$ we'll put an operator D, i.e. the sequence
goes "down". For our sample sequence, the corresponding operator sequence is:

 (1)  0.935536495
 (2)  0.191531578  D
 (3)  0.429049655  U
 (4)  0.308968021  D
 (5)  0.179540986  D
 (6)  0.401329789  U
 (7)  0.71581906   U
 (8)  0.604617962  D
 (9)  0.877254876  U
(10)  0.973280207  U
(11)  0.489033299  D
(12)  0.912367351  U
(13)  0.604552972  D
(14)  0.039395302  D
(15)  0.3780448    U
(16)  0.55317569   U
(17)  0.6308772    U
(18)  0.373163479  D
(19)  0.812434426  U
(20)  0.560173882  D
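The operator sequence above can be reproduced with a small sketch. The values are transcribed from the sample sequence, assuming each is a RAND() output in [0, 1) with its leading "0." restored; the helper names `operators` and `local_maxima` are hypothetical.

```python
# Sample sequence from Figure 1 (leading "0." assumed for every value).
SAMPLE = [
    0.935536495, 0.191531578, 0.429049655, 0.308968021, 0.179540986,
    0.401329789, 0.71581906, 0.604617962, 0.877254876, 0.973280207,
    0.489033299, 0.912367351, 0.604552972, 0.039395302, 0.3780448,
    0.55317569, 0.6308772, 0.373163479, 0.812434426, 0.560173882,
]


def operators(xs):
    """U where the sequence goes up, D where it goes down (one symbol per step)."""
    return "".join("U" if b > a else "D" for a, b in zip(xs, xs[1:]))


def local_maxima(xs):
    """1-based indices i with x[i-1] < x[i] > x[i+1]."""
    return [i + 1 for i in range(1, len(xs) - 1) if xs[i - 1] < xs[i] > xs[i + 1]]


ops = operators(SAMPLE)
peaks = local_maxima(SAMPLE)
distances = [b - a for a, b in zip(peaks, peaks[1:])]
print(ops)        # DUDDUUDUUDUDDUUUDUD
print(peaks)      # [3, 7, 10, 12, 17, 19]
print(distances)  # [4, 3, 2, 5, 2]
```

Every local maximum corresponds to a U immediately followed by a D in the operator string.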



We can apply our new notation to triplets and see that

(0,1,2) becomes (UU)
(0,2,1) and (1,2,0) become (UD)
(1,0,2) and (2,0,1) become (DU)
(2,1,0) becomes (DD)
Having this notation, we can use it to compute the probability that a local
maximum appears in the middle of any triplet. We are interested in triplets
represented by (UD), because it is the only expression that represents a local
maximum in the middle of a triplet. We shall use the standard cumulative
distribution function (CDF) $F(x)$, defined as the probability of $x < x_i$:

$$F(x_i) = \Pr(x < x_i) = \int_{-\infty}^{x_i} f(x)\,dx,$$

where $f(x)$ is the PDF (probability density function) of $x$. This can also
be written as

$$F(x_i) = \int_0^{F(x_i)} dF(x).$$

On the other hand, the probability of $x > x_i$ is

$$\int_{F(x_i)}^{1} dF(x) = 1 - F(x_i).$$

Now, we can write the following formula:

$$(UD) = \int_0^1 dF(x_{i-1}) \int_{F(x_{i-1})}^{1} dF(x_i) \int_0^{F(x_i)} dF(x_{i+1}).$$

The first integral declares that the first number in a triplet (UD) can have any
value. The next integral says that the number in the middle of a triplet should be
greater than the first number. Finally, the third integral is for the trailing number
of a triplet, which should be less than the previous number. It's easy to compute
the probability as follows:

$$\begin{aligned}
(UD) &= \int_0^1 dF(x_{i-1}) \int_{F(x_{i-1})}^{1} dF(x_i)\, F(x_i) \\
     &= \int_0^1 dF(x_{i-1}) \left( \frac{1}{2} - \frac{F(x_{i-1})^2}{2} \right) \\
     &= \frac{1}{2} - \frac{1}{6} = \frac{2}{6} = \frac{1}{3}.
\end{aligned}$$

We got the same number in both combinatorial and operator approaches. 
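The nested integrals above can be evaluated mechanically, because on the CDF scale every intermediate result is a polynomial. The following sketch illustrates that idea; it is not the author's program, and `integrate` and `pattern_probability` are hypothetical names. It works from the right end of an operator string to the left, keeping the polynomial as a list of exact coefficients.

```python
from fractions import Fraction


def integrate(poly):
    """Antiderivative with zero constant term; poly is [c0, c1, ...] in powers of t."""
    return [Fraction(0)] + [c / (k + 1) for k, c in enumerate(poly)]


def pattern_probability(pattern):
    """Probability that consecutive i.i.d. continuous values follow a U/D pattern.

    h(t) is the probability that the remaining steps succeed, given that the
    current value has CDF value t.
    """
    h = [Fraction(1)]                    # nothing left to check: probability 1
    for op in reversed(pattern):
        anti = integrate(h)
        if op == "D":                    # next value is lower: integrate from 0 to t
            h = anti
        elif op == "U":                  # next value is higher: integrate from t to 1
            at_one = sum(anti)           # antiderivative evaluated at t = 1
            h = [at_one - anti[0]] + [-c for c in anti[1:]]
        else:
            raise ValueError(f"unknown operator: {op!r}")
    return sum(integrate(h))             # first value is unrestricted: integrate over [0, 1]


print(pattern_probability("UD"))    # 1/3
print(pattern_probability("UDUD"))  # 2/15
```

Exact fractions avoid rounding error, so the result for (UD) can be compared directly with the combinatorial value 1/3.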

2. PMF of distances between local maxima

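Now, it's time to advance our notation. Let's break up (UD) into pieces.

Definition 2. Operators (, U, D and ) are defined as

$$(\,\cdot\,\psi(x) = \int_0^1 d\phi(x) \cdot \psi(x)$$

$$U \cdot \psi(x) = \int_{z(x)}^{1} d\phi(x) \cdot \psi(x)$$

$$D \cdot \psi(x) = \int_0^{z(x)} d\phi(x) \cdot \psi(x)$$

$$)\, = 1$$

Here $\psi$ is the part of the expression to the right of the operator and $z$
is the CDF value of the preceding number, so that (UD) expands to the triple
integral of Section 1.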

Armed with this notation, let's look at any quintet of numbers from $i - 1$ to
$i + 3$. If the numbers happen to come as (0,2,1,4,3), then we have two local
maxima, $x_i = 2$ and $x_{i+2} = 4$. The distance between these maxima is
$(i + 2) - i = 2$. This quintet can be represented by the expression (UDUD).
Such a quintet can be found in our sample sequence at $i = 10$:
($x_9 = 0.877254876$, $x_{10} = 0.973280207$, $x_{11} = 0.489033299$,
$x_{12} = 0.912367351$, $x_{13} = 0.604552972$).

Definition 3. If $x_i$ is a local maximum and the next nearest maximum is the
number $x_j$, then the distance between the maxima is $j - i$.

Now, we can compute the probability of the distance between local maxima being
equal to 2:

(2.1)  $$\Pr(d = 2) = f_m(2) = \frac{(UDUD)}{P_{max}} = \frac{(UDUD)}{(UD)},$$

where $d$ is the distance between local maxima and $f_m(d)$ is the probability
mass function (PMF) of the distribution of these distances.

Notice the denominator. It is necessary to divide the probability of the
quintet by the probability of the maximum in its first three numbers (triplet).
Consider the quintet ($x_{10} = 0.973280207$, $x_{11} = 0.489033299$,
$x_{12} = 0.912367351$, $x_{13} = 0.604552972$, $x_{14} = 0.039395302$) from
the sample sequence above. Its operator expression is (DUDD), which doesn't
seem to represent two maxima at distance 2. Let's add the number
$x_9 = 0.877254876$ and look at the resulting sextet. This sextet's operator
expression is (UDUDD). It starts with (UDUD). Clearly, our original quintet is
a part of a sextet with two maxima at distance 2. Therefore, we have to take
into account those quintets that are not represented by (UDUD) but could be
parts of the sought quintets. The probability $P_{max}$ in the denominator
accounts for those quintets, which would be left unaccounted for otherwise.

This formula can also be interpreted in terms of conditional probabilities as
follows:

(2.2)  $$\Pr(A \mid B) = \frac{\Pr(A \cap B)}{\Pr(B)},$$

where event $A \cap B$ is a quintet with one maximum in its head and one maximum
in its tail, event $B$ is the maximum in the first three numbers of a quintet,
and event $A \mid B$ is two maxima at a given distance from each other.

Using the same methodology, it's easy to show that the probability of the
distance 3 is

(2.3)  $$\Pr(d = 3) = f_m(3) = \frac{(UDUUD) + (UDDUD)}{P_{max}} = \frac{(UDU^2D) + (UD^2UD)}{P_{max}}.$$

In order to see why there are two terms in the numerator, consider these two
sextets: (0,2,1,3,5,4) and (0,3,2,1,5,4). The formulae for the next three
distances are

(2.4)  $$\Pr(d = 4) = f_m(4) = \frac{(UDU^3D) + (UD^2U^2D) + (UD^3UD)}{P_{max}}$$

(2.5)  $$\Pr(d = 5) = f_m(5) = \frac{(UDU^4D) + (UD^2U^3D) + (UD^3U^2D) + (UD^4UD)}{P_{max}}$$

(2.6)  $$\Pr(d = 6) = f_m(6) = \frac{(UDU^5D) + (UD^2U^4D) + (UD^3U^3D) + (UD^4U^2D) + (UD^5UD)}{P_{max}}$$
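The pattern behind equations 2.1 and 2.3 to 2.6 is that the numerator for distance $d$ collects every expression of the form $(UD^aU^bD)$ with $a + b = d$ and $a, b \ge 1$. The sketch below generates those terms and, as an independent brute-force check rather than the integral method, counts matching permutations of $d + 3$ distinct values. The names `terms`, `matches` and `pmf_by_enumeration` are hypothetical, and the enumeration is only practical for small $d$.

```python
from fractions import Fraction
from itertools import permutations
from math import factorial


def terms(d):
    """Operator strings U D^a U^b D with a + b = d and a, b >= 1."""
    return ["U" + "D" * a + "U" * (d - a) + "D" for a in range(1, d)]


def matches(perm, pattern):
    """True if the ups and downs of perm follow the operator string exactly."""
    return all((b > a) == (op == "U") for a, b, op in zip(perm, perm[1:], pattern))


def pmf_by_enumeration(d):
    """f_m(d): share of orderings of d + 3 values matching any term, divided by P_max."""
    n = d + 3
    wanted = terms(d)
    hits = sum(1 for p in permutations(range(n)) if any(matches(p, w) for w in wanted))
    return Fraction(hits, factorial(n)) / Fraction(1, 3)


print(terms(3))               # ['UDUUD', 'UDDUD']
print(pmf_by_enumeration(2))  # 2/5
print(pmf_by_enumeration(3))  # 1/3
```

Counting permutations is valid here because, for independent numbers from one continuous distribution, every ordering of $d + 3$ consecutive values is equally likely.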

3. Results 

In order to compute the probability of a given distance between maxima, we have
to identify the corresponding integrals, evaluate them and sum them up. For
example, computing the probability of distance 4 involves evaluating the three
integrals in equation 2.4. A simple analytical expression for the sums of
integrals in the numerators of the above probabilities was presented in [6]
(see equation 3.8):

(3.1)  [equation illegible in the source text]

In [6] a set of similar problems, e.g. permutation-generated random walks, is
studied using a different and more generic approach. However, equation 3.1
can be used to derive the probability of distances between local maxima:

$$f_m(d) = \frac{1}{P_{max}} \cdot \frac{(d - 1)(d + 2)\,2^d}{(d + 3)!}$$

We were not aware of this work, and in the absence of a simple analytical
expression for the sums of integrals in the PMF equations (such as 2.4), we
wrote a Java program that does all the required work. First, it generates the
necessary integrals using our operator notation, e.g.
$(UDU^3D) + (UD^2U^2D) + (UD^3UD)$. Next, it evaluates the corresponding
integrals and sums them symbolically.






ARGYN KUKETAYEV 



The resulting PMF table is shown in Table 1. The variance of this distribution
is ≈ 1.167168 and the standard deviation is ≈ 1.08.

Table 1. Table of PMF of distances between local maxima

Distance  Probability                                          Decimal Approximation
2         2/5                                                  0.4
3         1/3                                                  0.3333333333333333
4         6/35                                                 0.1714285714285714
5         1/15                                                 0.06666666666666667
6         4/189                                                0.02116402116402116
7         1/175                                                0.005714285714285714
8         2/1485                                               0.001346801346801347
9         4/14175                                              2.821869488536155E-4
10        4/75075                                              5.328005328005328E-5
11        2/218295                                             9.161913923818686E-6
12        4/2764125                                            1.447112558223669E-6
13        1/4729725                                            2.114287828573543E-7
14        8/278326125                                          2.874325936884294E-8
15        1/273648375                                          3.654324641978963E-9
16        2/4583103525                                         4.363855167334454E-10
17        8/162820783125                                       4.913377669887681E-11
18        4/764299911375                                       5.2335476433640715E-12
19        2/3781060408125                                      5.289521414950853E-13
20        4/78642438249375                                     5.0863122876684074E-14
21        2/428772250281375                                    4.664480965565126E-15
22        8/19566987612046875                                  4.088518968077954E-16
23        2/58274046742786875                                  3.432059573325511E-17
24        4/1447106344699640625                                2.7641368684830376E-18
25        8/37392513326621578125                               2.1394657080470717E-19
26        8/501914364595623354375                              1.5938973985025013E-20
27        4/3494761822449632109375                             1.1445701318770342E-21
28        8/100847608441898396203125                           7.932761246003232E-23
29        1/188217886723358757890625                           5.312991328341671E-24
Total     2722885427931256697484374/2722885427931256697484375  1 - 3.6725746509274224E-25
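The entries of Table 1 can be cross-checked with a few lines of exact arithmetic. The sketch assumes the closed form $f_m(d) = 3\,(d-1)(d+2)\,2^d/(d+3)!$, which is an assumption of this example (it reproduces the tabulated fractions but is not taken from the author's program); the function name `pmf` is hypothetical.

```python
from fractions import Fraction
from math import factorial, sqrt


def pmf(d):
    """Assumed closed form for the probability of distance d between local maxima."""
    return Fraction(3 * (d - 1) * (d + 2) * 2**d, factorial(d + 3))


table = {d: pmf(d) for d in range(2, 30)}
total = sum(table.values())
mean = sum(d * p for d, p in table.items())
variance = sum(d * d * p for d, p in table.items()) - mean**2

for d, p in table.items():
    print(d, p, float(p))
print("missing mass:", 1 - total)
print("mean:", float(mean))
print("variance:", float(variance), "std dev:", sqrt(variance))
```

Because the table stops at distance 29, the total, mean and variance are very slightly below their limiting values.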



We tested the validity of the computed PMF table on several random and
pseudo-random number sequences. For pseudo-random number sequences we used
Java's standard pseudo-random generator java.util.Random and Daniel Cer's Java
implementation [1] of the notorious RANDU generator [2]. We used Mads Haahr's
True Random Number Service web site [3] as a source of "true" random numbers.
We modified the supplied Java client, which connects to the server and
retrieves batches of true random numbers. We generated random number sequences
using these methods and compared them with the theoretical PMF using several
tests, such as the Kolmogorov-Smirnov and goodness-of-fit tests; see chapters
1.3.5.15 and 1.3.5.16 in [4]. Also, according to the central limit theorem, in
large samples the standard deviation of the average distance between maxima
should approach $\sigma / \sqrt{n}$, where $\sigma$ is the standard deviation
of the distances in the population and $n$ is the size of the sample [5]. We
used this feature to compare sample average distances to the theoretical
average distance of 3.

Table 2. Table of CDF of distances between local maxima

Distance d  Cumulative Probability F_m(d)                        Decimal Approximation
2           2/5                                                  0.4
3           11/15                                                0.7333333333333333
4           19/21                                                0.9047619047619048
5           34/35                                                0.9714285714285714
6           134/135                                              0.9925925925925926
7           4717/4725                                            0.9983068783068783
8           5773/5775                                            0.9996536796536797
9           31183/31185                                          0.9999358666025333
10          184273/184275                                        0.9999891466558133
11          4729717/4729725                                      0.9999983085697371
12          16372121/16372125                                    0.9999997556822954
13          30405374/30405375                                    0.9999999671110782
14          241215974/241215975                                  0.9999999958543376
15          32564156609/32564156625                              0.9999999995086623
16          36395233873/36395233875                              0.9999999999450477
17          343732764373/343732764375                            0.9999999999941815
18          3419236445623/3419236445625                          0.999999999999415
19          142924083427117/142924083427125                      0.999999999999944
20          782679504481871/782679504481875                      0.9999999999999949
21          4482618980214373/4482618980214375                    0.9999999999999996
22          53596531285171873/53596531285171875                  ≈ 1.0
23          5341787618088796859/5341787618088796875              ≈ 1.0
24          17307391882607701871/17307391882607701875            ≈ 1.0
25          232984121496642140621/232984121496642140625          ≈ 1.0
26          3253148659416077296871/3253148659416077296875        ≈ 1.0
27          188217886723358757890609/188217886723358757890625    ≈ 1.0
28          1408389014447201740078117/1408389014447201740078125  ≈ 1.0
29          2722885427931256697484374/2722885427931256697484375  ≈ 1.0

As expected, the java.util.Random and "true" random sequences were consistent
with our PMF for all sample sizes from 100 to 100,000,000. Surprisingly,
RANDU-generated sequences were also compliant with this PMF. When deriving
this PMF, we assumed that the numbers in the sequences are independent of each
other. The RANDU generator has a well-known deficiency: its numbers are not
independent. Nevertheless, it fared well in our tests.
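A RANDU experiment of this kind can be sketched as follows. This is an illustration, not the Java code used in the study: it assumes the usual RANDU recurrence $x_{k+1} = 65539\,x_k \bmod 2^{31}$ with an odd seed, and the seed, sample size and helper names are arbitrary choices made here.

```python
from collections import Counter


def randu(seed, count):
    """RANDU recurrence: x_{k+1} = 65539 * x_k mod 2**31 (the seed should be odd)."""
    x = seed
    values = []
    for _ in range(count):
        x = (65539 * x) % 2**31
        values.append(x)
    return values


def peak_distances(xs):
    """Distances between consecutive local maxima of the sequence xs."""
    peaks = [i for i in range(1, len(xs) - 1) if xs[i - 1] < xs[i] > xs[i + 1]]
    return [b - a for a, b in zip(peaks, peaks[1:])]


distances = peak_distances(randu(seed=1, count=300_000))
counts = Counter(distances)
mean = sum(distances) / len(distances)

print(f"mean distance: {mean:.5f}")
for d in sorted(counts):
    print(d, counts[d] / len(distances))  # observed relative frequency of distance d
```

The printed relative frequencies can be compared by eye with the theoretical probabilities; a formal comparison would use the tests named above.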

Table 3 shows sample statistics for "true" random and RANDU-generated
sequences compared to the theoretical frequencies of distances between maxima.
Both samples are distributed as predicted by the theoretical PMF $f_m(d)$:
they pass the goodness-of-fit test with probabilities higher than 0.99. Their
average distances are also within $3\sigma$ of the theoretical value of 3. The
p-value for the latter test is the probability of a deviation from the
theoretical average distance greater than the observed one.

Table 3. Sample Frequency Comparison

Distance         Theoretical  True Random  RANDU
                 Frequency    Frequency    Frequency
2                40000        39803        40462
3                33333        33544        33003
4                17143        17119        17073
5                6667         6673         6545
6                2116         2139         2157
7                571          549          571
8                135          136          148
9                28           31           36
10               5            3            5
11               1            2
12                            1
Average          3            3.00187      2.99447
Std Dev of Mean  0.0034
p-value                       0.584        0.106
χ², df                        1.006, 10    1.386, 8
p-value                       0.9998       0.9944
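The "Std Dev of Mean" and the first p-value row of Table 3 can be reproduced approximately with the normal approximation from the central limit theorem. This sketch assumes a sample of 100,000 distances (the sum of each observed frequency column) and the variance 1.167168 quoted earlier; the function names are hypothetical.

```python
from math import erfc, sqrt

VARIANCE = 1.167168  # variance of the distance distribution quoted in the text
N = 100_000          # assumed sample size: sum of each frequency column in Table 3


def std_dev_of_mean(variance=VARIANCE, n=N):
    """Standard deviation of the sample mean, sigma / sqrt(n)."""
    return sqrt(variance / n)


def two_sided_p_value(sample_mean, expected=3.0):
    """Probability of a deviation from `expected` at least as large as observed."""
    z = abs(sample_mean - expected) / std_dev_of_mean()
    return erfc(z / sqrt(2))  # two-sided tail of the standard normal distribution


print(round(std_dev_of_mean(), 4))             # 0.0034
print(round(two_sided_p_value(3.00187), 3))    # true random sample
print(round(two_sided_p_value(2.99447), 3))    # RANDU sample
```

Both sample averages lie well inside three standard deviations of the mean, which is the criterion stated in the text.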



Conclusion 1. We constructed a simple method of computing the PMF of the
distribution of distances between local maxima in random number series. We
confirmed that the distances in selected pseudo-random and true random number
sequences are distributed according to this PMF.